[shimV2] adds core pod, container and process controllers#2674
rawahars merged 2 commits into microsoft:main
Adds the `internal/controller` package hierarchy with three new sub-packages that provide lifecycle management for Linux Containers on Windows (LCOW):

- `linuxcontainer`: manages the full lifecycle of a single LCOW container inside a Utility VM, including host-side resource allocation (SCSI layers, Plan9 shares, vPCI devices), guest-side container creation via the GCS, and state machine transitions.
- `pod`: manages a single pod running inside a UVM, owning the network controller and tracking all container controllers belonging to the pod.
- `process`: manages individual process (exec) instances within a container, handling IO plumbing, signal delivery, exit status reporting, and a linear state machine.

Each package includes comprehensive unit tests, mock types, and documentation.

Signed-off-by: Harsh Rawat <harshrawat@microsoft.com>
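For readers unfamiliar with the "linear state machine" mentioned for `process`, here is a minimal sketch of the idea. The state names, the `Process` type and the `advance` method are all hypothetical and are not taken from this PR; the sketch also assumes "linear" means each transition moves exactly one step forward, which may be stricter than what the real package allows.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// State is a hypothetical lifecycle state; the names below are illustrative only.
type State int

const (
	StateCreated State = iota
	StateRunning
	StateExited
)

// ErrInvalidTransition is returned when a transition would skip or go backwards.
var ErrInvalidTransition = errors.New("invalid state transition")

// Process is a toy stand-in for a per-process controller (not the PR's type).
type Process struct {
	mu    sync.Mutex
	state State
}

// advance moves to next only if it is exactly one step ahead of the current state.
func (p *Process) advance(next State) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if next != p.state+1 {
		return fmt.Errorf("%w: %d -> %d", ErrInvalidTransition, p.state, next)
	}
	p.state = next
	return nil
}

// Current returns the state under the lock.
func (p *Process) Current() State {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.state
}

func main() {
	p := &Process{}
	fmt.Println("start:", p.advance(StateRunning))
	fmt.Println("start again:", p.advance(StateRunning))
	fmt.Println("exit:", p.advance(StateExited))
}
```

Because the state only ever moves forward, a second start or a second exit is rejected instead of silently repeating work.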
helsaawy left a comment
not sure about the proliferation of controllers that will be allocated
)

// Controller is the concrete implementation of the LCOW container controller.
// It manages the full lifecycle of a single LCOW container.
do we need a controller per container? can we not follow the model we have for devices/* and have one controller for all containers?
i doubt that each container needs to track its [scsi|plan9|vPCI]Controller
similar for the process controller; having one controller for all processes in a container would probably be cleaner
Collapsing all containers/processes into a single controller would make the workflow cumbersome and introduce a host of problems. The two patterns (`linuxcontainer` vs device controllers) are solving different problems:

- Device singletons exist to dedupe and refcount shared host resources: multiple containers can reference the same SCSI disk, so one controller centralizes the reservation/refcount and ensures the VM-level attach happens once.
- Containers don't have any shared state to dedupe. Every field on `linuxcontainer.Controller` is strictly per-container: the GCS container handle, state machine, `terminatedCh`, processes map, per-container reservation lists, and a mutex scoped to that container's lifecycle. Two containers never refer to "the same container" the way they do the same disk.
- If we collapsed it, we'd end up with `map[cid]*containerState` inside one big controller, i.e. the same struct moved one level down. No real consolidation, but real costs:
  - Locking gets worse: either one big mutex held across long GCS calls, or per-entry sub-locks (which is what we already have).
  - Lifecycle ownership blurs: today `pod.Controller` = registry, `linuxcontainer.Controller` = container lifecycle, `process.Controller` = process lifecycle. Each has its own `Wait`/teardown workflow. Merging the middle layer conflates registry with lifecycle owner.
- The "central tracker" already exists: it's `pod.Controller`. Cross-container ops (`kill --all`, `sandbox-delete` precondition checks) already go through `podCtrl.ListContainers()` in `service_task_internal.go` without forcing containers to share a controller.

TL;DR: device singletons fit shared, refcounted host resources; containers are independent entities with their own state machines, and `pod.Controller` already plays the central-registry role.
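To make the "dedupe + refcount" half of this argument concrete, here is a minimal sketch of a device-style singleton. `DiskController`, `Reserve` and `Release` are hypothetical names invented for illustration and are not the hcsshim SCSI controller API; the `attaches` counter only stands in for the real VM-level attach.

```go
package main

import (
	"fmt"
	"sync"
)

// DiskController is a hypothetical singleton that dedupes attaches of the
// same host disk across containers.
type DiskController struct {
	mu       sync.Mutex
	refs     map[string]int // host path -> number of current holders
	attaches int            // how many VM-level attaches were actually issued
}

func NewDiskController() *DiskController {
	return &DiskController{refs: make(map[string]int)}
}

// Reserve attaches the disk on first use and only bumps the refcount afterwards.
func (d *DiskController) Reserve(hostPath string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.refs[hostPath] == 0 {
		d.attaches++ // stand-in for the one-time VM-level attach
	}
	d.refs[hostPath]++
}

// Release drops one reference and reports whether the last holder went away.
func (d *DiskController) Release(hostPath string) (detached bool, err error) {
	d.mu.Lock()
	defer d.mu.Unlock()
	n, ok := d.refs[hostPath]
	if !ok {
		return false, fmt.Errorf("disk %q is not reserved", hostPath)
	}
	if n == 1 {
		delete(d.refs, hostPath)
		return true, nil
	}
	d.refs[hostPath] = n - 1
	return false, nil
}

func main() {
	d := NewDiskController()
	d.Reserve("layer.vhdx") // container A
	d.Reserve("layer.vhdx") // container B shares the same layer
	fmt.Println("attaches:", d.attaches)
	detached, _ := d.Release("layer.vhdx")
	fmt.Println("detached after first release:", detached)
}
```

The shared map is the whole point of the singleton here: two callers can name the same disk. Nothing equivalent exists for containers, since no two callers ever create "the same" container.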
Also, I hear your concern about the proliferation of controllers, but it's not going to be worse than what we already have with `containerd-shim-runhcs-v1`, where `hcsTask` corresponds to a single container and `hcsExec` corresponds to a single exec/process.
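The registry-versus-lifecycle split described in this reply can be sketched roughly as follows. Only the name `ListContainers` comes from the discussion; its signature here, along with the `Pod`, `Container`, `AddContainer` and `KillAll` names, are assumptions made for illustration and do not mirror the PR's actual types.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Container is a toy per-container controller: its mutex and state are
// scoped to this one container only.
type Container struct {
	id     string
	mu     sync.Mutex
	killed bool
}

func (c *Container) Kill() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.killed = true
}

func (c *Container) Killed() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.killed
}

// Pod is a toy registry: it tracks containers but holds none of their state.
type Pod struct {
	mu         sync.RWMutex
	containers map[string]*Container
}

func NewPod() *Pod {
	return &Pod{containers: make(map[string]*Container)}
}

// AddContainer registers a new per-container controller under id.
func (p *Pod) AddContainer(id string) (*Container, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if _, exists := p.containers[id]; exists {
		return nil, fmt.Errorf("container %q already exists", id)
	}
	c := &Container{id: id}
	p.containers[id] = c
	return c, nil
}

// ListContainers returns a sorted snapshot, so callers iterate without
// holding the pod lock during per-container calls.
func (p *Pod) ListContainers() []*Container {
	p.mu.RLock()
	defer p.mu.RUnlock()
	out := make([]*Container, 0, len(p.containers))
	for _, c := range p.containers {
		out = append(out, c)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].id < out[j].id })
	return out
}

// KillAll is a cross-container operation built purely on the registry.
func KillAll(p *Pod) int {
	n := 0
	for _, c := range p.ListContainers() {
		c.Kill() // takes only that container's lock
		n++
	}
	return n
}

func main() {
	p := NewPod()
	p.AddContainer("c1")
	p.AddContainer("c2")
	fmt.Println("killed:", KillAll(p))
}
```

In this sketch the pod lock is held only long enough to copy the list, so a slow call on one container never blocks registry lookups or operations on another container.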